Indirect Symbolic Correlation Approach to Unsegmented Text Recognition
Authors
Abstract
Over the last twenty years, most recognition engines for difficult-to-segment scripts have been built around Hidden Markov Models (HMMs). Parametric recognizers for unsegmented signals, such as HMMs, are hard to train. In contrast, non-parametric classifiers, such as Nearest-Neighbor, require only a labeled reference list. In this paper, we provide preliminary results in support of an entirely new method for non-parametric classification of unsegmented text. Indirect symbolic correlation is a general method for bringing lexical context into the recognition of unsegmented signals that represent words or phrases in printed form. It is applicable wherever segments of lexically labeled reference signals can be compared to unlabeled signals. The signal need only preserve the ordering of the alphabetic units within a word (and of words within a phrase).
Similar resources
Match graph generation for symbolic indirect correlation
Symbolic indirect correlation (SIC) is a new approach for bringing lexical context into the recognition of unsegmented signals that represent words or phrases in printed or spoken form. One way of viewing the SIC problem is to find the correspondence, if one exists, between two bipartite graphs, one representing the matching of the two lexical strings and the other representing the matching of ...
Online Handwriting Recognition Using Time-Order of Lexical and Signal Co-Occurrences
Letter-polygram based Symbolic Indirect Correlation is a new method that offers significant advantages for ordered unsegmented signals. However, its application to on-line, cursive handwriting requires solving several difficult problems. (1) Reference strings of words must satisfy certain uniformity properties on their lexical match with the lexicon of unknown words. (2) The Viterbi algorithm...
Prototype Extraction and Adaptive OCR
To maintain OCR accuracy with decreasing quality of page image composition, production, and digitization, it is essential to tune the system to each document. We propose a prototype extraction method for document-specific OCR systems. The method automatically generates training samples from unsegmented text images and the corresponding transcripts. It is tolerant of transcription errors, so a ...
Experience of Symbolic Violence among Immigrant Student’s and Its Psychosocial Consequences
Introduction: Migration, in addition to social, cultural, and economic consequences, undoubtedly has different psychological consequences for immigrants. These consequences can have a huge impact on their mental health. Therefore, the purpose of this study was to investigate the experience of symbolic violence and its psycho-social consequences among immigrant students and to determine its fina...
Why does removing inter-word spaces produce reading deficits? The role of parafoveal processing.
To examine the role of inter-word spaces during reading, we used a gaze-contingent boundary paradigm to manipulate parafoveal preview (i.e., valid vs. invalid preview) in a normal text condition that contained spaces (e.g., "John decided to sell the table") and in an unsegmented text condition that contained random numbers instead of spaces (e.g., "John4decided8to5sell9the7table"). Preview effec...